Evaluating Complement-Modifier Distinctions in a Semantically Annotated Corpus
Authors
Abstract
We evaluate the extent to which the distinction between semantically core and non-core dependents as used in the FrameNet corpus corresponds to the traditional distinction between syntactic complements and modifiers of a verb, for the purposes of harvesting a wide-coverage verb lexicon from FrameNet for use in deep linguistic processing applications. We use the VerbNet verb database as our gold standard for making judgements about complement-hood, in conjunction with our own intuitions in cases where VerbNet is incomplete. We conclude that there is enough agreement between the two notions (0.85) to make practical the simple expedient of equating core PP dependents in FrameNet with PP complements in our lexicon. Doing so means that we lose around 13% of PP complements, whilst around 9% of the PP dependents left in the lexicon are not complements.
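The two percentages in the abstract can be read as error rates of a simple rule: keep a PP dependent in the lexicon if and only if FrameNet marks it as core. The sketch below shows how such rates could be computed from paired judgements. The function name, the input format, and the toy data are hypothetical and are not taken from the paper; the abstract does not say how the 0.85 agreement figure was computed, so it is not reproduced here.

```python
def core_as_complement_rates(pairs):
    """Return (lost, spurious) for the rule "core PP dependent = PP complement".

    pairs: iterable of (is_core, is_complement) booleans, one per PP dependent.
    lost:     share of gold complements that are non-core, so they are dropped.
    spurious: share of kept (core) dependents that are not gold complements.
    Assumes at least one complement and at least one core dependent.
    """
    pairs = list(pairs)
    kept_complements = sum(1 for core, comp in pairs if core and comp)
    kept_total = sum(1 for core, _ in pairs if core)
    all_complements = sum(1 for _, comp in pairs if comp)
    lost = 1 - kept_complements / all_complements
    spurious = 1 - kept_complements / kept_total
    return lost, spurious


# Hypothetical toy data, not the paper's counts.
toy = (
    [(True, True)] * 8      # core and complement: kept correctly
    + [(False, True)] * 2   # non-core complement: lost
    + [(True, False)] * 2   # core non-complement: kept wrongly
    + [(False, False)] * 8  # non-core modifier: dropped correctly
)
lost, spurious = core_as_complement_rates(toy)
```

Under this reading, losing around 13% of PP complements corresponds to a recall of about 0.87, and 9% non-complements among the kept dependents corresponds to a precision of about 0.91.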
Similar resources
Semantically Enriched Models for Modal Sense Classification
Modal verbs have different interpretations depending on their context. Previous approaches to modal sense classification achieve relatively high performance using shallow lexical and syntactic features. In this work we uncover the difficulty of particular modal sense distinctions by eliminating both distributional bias and sparsity of existing small-scale annotated corpora used in prior work. W...
Developing linguistic theories using annotated corpora
This paper aims to carve out a place for corpus research within theoretical linguistics and psycholinguistics. We argue that annotated corpora naturally complement native speaker intuitions and controlled psycholinguistic methods and thus can be powerful tools for developing and evaluating linguistic theories. We also review basic methods and best practices for moving from corpus annotations to...
Extending a Verb-lexicon Using a Semantically Annotated Corpus
This paper describes the association of an hierarchical verb lexicon, VerbNet, with a semantically annotated corpus, the Proposition Bank. It focuses on comparisons of the syntactic coverage of the two resources as a method of evaluating their correspondence. Both VerbNet and PropBank have explicit syntactic frames associated with each verb, which allowed an automatic mapping between the two re...
Building a semantically annotated corpus of clinical texts
In this paper, we describe the construction of a semantically annotated corpus of clinical texts for use in the development and evaluation of systems for automatically extracting clinically significant information from the textual component of patient records. The paper details the sampling of textual material from a collection of 20,000 cancer patient records, the development of a semantic ann...
Evaluating Cross-Language Annotation Transfer in the MultiSemCor Corpus
In this paper we illustrate and evaluate an approach to the creation of high quality linguistically annotated resources based on the exploitation of aligned parallel corpora. This approach is based on the assumption that if a text in one language has been annotated and its translation has not, annotations can be transferred from the source text to the target using word alignment as a bridge. Th...